Optimizing Large Join Queries Using A Graph-Based Approach
Authors
Abstract
Although many query tree optimization strategies have been proposed in the literature, a formal, complete, and uniform representation of all possible permutations of query operations (i.e., execution plans) is still lacking. The graph-theoretic approach presented in this paper provides a sound mathematical basis for representing a query and searching for an execution plan. In this graph model, a node represents an operation, and a directed edge between two nodes indicates the order in which the two operations are executed in an execution plan. Each node and each edge is associated with a weight. A weight is an expression containing the parameters required for optimization, such as relation size, tuple size, and join selectivity factors. All possible execution plans are representable in this graph, and each spanning tree of the graph corresponds to an execution plan. The model is general and can be used in the optimizer of a DBMS for internal query representation. On the basis of this model, we devise an algorithm that finds a near-optimal execution plan in polynomial time. The algorithm is compared with several other popular optimization methods. Experiments show that the proposed algorithm is superior to the others under most circumstances.
Index Terms: Large join query, graph theory, join cost, query tree, query optimization.
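To make the idea of weighted operations and an ordering over them concrete, the following is a minimal Python sketch. It is not the algorithm from the paper. The `Join` class, the `greedy_plan` function, the relation names and sizes, and the cost model (estimated result size equals the product of the input sizes and the join selectivity) are all assumptions introduced here for illustration. The greedy rule simply executes the cheapest remaining join next, which runs in polynomial time but carries no optimality guarantee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Join:
    """A join operation, i.e. one node of the graph (fields are illustrative)."""
    left: str
    right: str
    selectivity: float


def greedy_plan(sizes, joins):
    """Return (joins in execution order, total estimated cost).

    Assumed cost model: a join's cost is its estimated result size,
    |A| * |B| * selectivity. The greedy rule is a stand-in heuristic,
    not the algorithm from the paper.
    """
    group = {rel: rel for rel in sizes}   # relation -> id of its intermediate result
    card = dict(sizes)                    # intermediate result id -> estimated rows
    remaining = list(joins)
    order, total = [], 0.0

    def estimate(join):
        a, b = group[join.left], group[join.right]
        if a == b:                        # both sides already joined: acts as a filter
            return card[a] * join.selectivity
        return card[a] * card[b] * join.selectivity

    while remaining:
        best = min(remaining, key=estimate)   # cheapest next operation
        rows = estimate(best)
        a, b = group[best.left], group[best.right]
        for rel, g in group.items():          # merge b's relations into a
            if g == b:
                group[rel] = a
        card[a] = rows
        remaining.remove(best)
        order.append(best)
        total += rows
    return order, total


# Hypothetical statistics for three relations and two join predicates.
sizes = {"R": 1000, "S": 100, "T": 10}
joins = [Join("R", "S", 0.01), Join("S", "T", 0.1)]
order, total = greedy_plan(sizes, joins)
for step, join in enumerate(order, start=1):
    print(step, join.left, "JOIN", join.right)
print("estimated cost:", total)
```

With these made-up statistics the sketch schedules the join of S and T before the join of R and S, because its estimated result is smaller. A real optimizer would also need the tuple-size and edge-weight terms that the abstract mentions.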
Similar resources
[4] Chiang Lee, Chi-Sheng Shih, and Yaw-Huei Chen. Optimizing large join queries using a graph-based approach. IEEE Trans. Knowl. Data Eng., 13(2):298–315, 2001.
References [1] Leonidas Fegaras. A new heuristic for optimizing large queries. [2] Toshihide Ibaraki and Tiko Kameda. On the optimal nesting order for computing n-relational joins. Optimizing large join queries using a graph-based approach. [5] Guido Moerkotte and Thomas Neumann. Analysis of two existing and one new dynamic programming algorithm for the generation of optimal bushy join trees wi...
An Intermediate Algebra for Optimizing RDF Graph Pattern Matching on MapReduce
Existing MapReduce systems support relational style join operators which translate multi-join query plans into several Map-Reduce cycles. This leads to high I/O and communication costs due to the multiple data transfer steps between map and reduce phases. SPARQL graph pattern matching is dominated by join operations, and is unlikely to be efficiently processed using existing techniques. This co...
Optimizing Large Query by Simulated Annealing Algorithm Based On Graph-Based Approach
In the relational database setting today, large queries containing many joins are becoming increasingly common. In general, query performance is highly sensitive to the ordering of join operations, and a poor ordering can have a devastating effect on the efficiency of the DBMS. Scheufele and Moerkotte proved that join-ordering is NP-complete in the general case [1]. The dynamic programming algorithm has a worst case running tim...
Graph summaries for optimizing graph pattern queries on RDF databases
The adoption of the Resource Description Framework (RDF) as a metadata and semantic data representation standard is spurring the development of high-level mechanisms for storing and querying RDF data. A common approach for managing and querying RDF data is to build on Relational/Object Relational Database systems and translate queries in an RDF query language into queries in the native language...
Relational Databases Query Optimization using Hybrid Evolutionary Algorithm
Optimizing database queries is a hard research problem. Exhaustive search techniques such as dynamic programming are suitable for queries with only a few relations, but as the number of relations in a query grows they require too much memory and processing, so randomized and evolutionary methods must be used instead. The use of evolutionary methods, beca...
Journal title:
- IEEE Trans. Knowl. Data Eng.
Volume 13, Issue
Pages -
Publication date: 2001